Clustering High-Dimensional Data Using Evidence of Multimodality

Author

  • Peter Hall
Abstract

We suggest a nonparametric approach to clustering very high-dimensional data, designed particularly for problems where the mixture nature of a population is expressed through multimodality of its density. In such cases a technique based implicitly on mode-testing can be particularly effective. In principle, several alternative approaches could be used to assess the extent of multimodality, but in the present problem the excess mass method has important advantages. The resulting methodology for determining clusters is particularly effective in cases where the data are relatively heavy tailed or show a moderate to high degree of correlation, or when the number of important components is relatively small.
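To make the excess mass idea concrete, here is a minimal univariate sketch. It is not the paper's procedure: the abstract does not say how excess mass is applied to high-dimensional data, so this only illustrates the standard empirical excess mass for a single variable, E_m(lambda), taken as the largest value of (fraction of points covered) minus lambda times (total length) over at most m disjoint intervals with endpoints at data points. The function names `excess_mass_pair` and `excess_mass_difference`, and the choice of a lambda grid, are assumptions made for illustration.

```python
import numpy as np


def excess_mass_pair(x, lam):
    """Return (E_1, E_2): empirical excess mass at level lam using at most
    one and at most two disjoint intervals. Assumes x is non-empty.
    Illustrative O(n^2) implementation, not an optimized one."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # score[i, j] = mass of [x_i, x_j] minus lam * length, for i <= j
    rows, cols = np.triu_indices(n)
    score = np.full((n, n), -np.inf)
    score[rows, cols] = (cols - rows + 1) / n - lam * (x[cols] - x[rows])
    e1 = float(score.max())
    if n < 2:
        return e1, e1
    # best_end[k]: best interval ending at index <= k
    best_end = np.maximum.accumulate(score.max(axis=0))
    # best_start[k]: best interval starting at index >= k
    best_start = np.maximum.accumulate(score.max(axis=1)[::-1])[::-1]
    # two disjoint intervals: one ends at <= k, the other starts at >= k + 1
    e2 = max(e1, float((best_end[:-1] + best_start[1:]).max()))
    return e1, e2


def excess_mass_difference(x, lams):
    """Largest gain from allowing a second interval, over a grid of levels.
    The grid `lams` is a user choice (an assumption of this sketch)."""
    return max(e2 - e1 for e1, e2 in (excess_mass_pair(x, lam) for lam in lams))
```

A large value of `excess_mass_difference` suggests that a second interval captures mass that one interval cannot, which is evidence of more than one mode. Turning that into a formal test requires calibrating the statistic against a unimodal reference, which this sketch does not attempt.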

Similar resources

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...

Feature Selection For High-Dimensional Clustering

We present a nonparametric method for selecting informative features in high-dimensional clustering problems. We start with a screening step that uses a test for multimodality. Then we apply kernel density estimation and mode clustering to the selected features. The output of the method consists of a list of relevant features, and cluster assignments. We provide explicit bounds on the error rat...

An Effective and Efficient Approach for Clusterability Evaluation

Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis. Yet, despite their central role in the theory and application of clustering, current notions of clusterability fall short in two crucial aspects that render them...

A Novel Subsampling Method for 3D Multimodality Medical Image Registration Based on Mutual Information

Mutual information (MI) is a widely used similarity metric for multimodality image registration. However, it involves an extremely high computational time especially when it is applied to volume images. Moreover, its robustness is affected by existence of local maxima. The multi-resolution pyramid approaches have been proposed to speed up the registration process and increase the accuracy of th...

Publication date: 2010